GNBF5030 Homework 2

Student id: 1155228903

Question 1

  1. First, download orf_trans.fasta (S. cerevisiae S288C) from yeastgenome.org using wget.

  2. Then create a BLAST protein database from these sequences.

  3. To find sequences that are similar to one another, run blastp with these sequences as the query against the database created in step 2, so that the set is searched against itself.

  4. Finally, summarize the BLAST output and keep only the proteins that are similar to at least one other protein in the yeast proteome. Use the wc command to show the number of proteins in your output file.
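
The filtering in step 4 could be sketched as follows. This is a minimal illustration, not the required solution. It assumes the blastp run in step 3 was written in tabular format (-outfmt 6, default column order), that a protein's hit against itself does not count as similarity, and that an E-value cutoff of 1e-10 is acceptable. The cutoff, the function name similar_proteins, and the protA/protB/protC rows are all hypothetical and chosen only for the example.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def similar_proteins(handle, max_evalue=1e-10):
    """Return the sorted IDs of queries with at least one non-self hit.

    max_evalue is an assumed cutoff, not one given in the assignment.
    """
    found = set()
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(COLUMNS):
            continue  # skip blank or malformed lines
        hit = dict(zip(COLUMNS, row))
        if hit["qseqid"] == hit["sseqid"]:
            continue  # every protein matches itself; ignore self-hits
        if float(hit["evalue"]) <= max_evalue:
            found.add(hit["qseqid"])
    return sorted(found)


# Made-up rows standing in for real blastp output.
example = "\n".join([
    "protA\tprotA\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t620",
    "protB\tprotB\t100.0\t250\t0\t0\t1\t250\t1\t250\t0.0\t510",
    "protB\tprotC\t45.0\t240\t120\t4\t5\t244\t3\t240\t1e-50\t180",
    "protC\tprotB\t45.0\t240\t120\t4\t3\t240\t5\t244\t1e-50\t180",
    "protA\tprotD\t25.0\t60\t40\t2\t10\t69\t20\t79\t0.5\t30",
])

ids = similar_proteins(io.StringIO(example))
print("\n".join(ids))   # one protein ID per line, as in an output file
print(len(ids))         # the count that wc -l would report for that file
```

On real data, the same function can be called on an open file handle for the blastp output, and the returned IDs written one per line so that wc -l on the output file gives the number of proteins.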

Question 2